High Performance Pattern Matching on Heterogeneous Platform

Authors

  • Shima Soroushnia
  • Masoud Daneshtalab
  • Juha Plosila
  • Tapio Pahikkala
  • Pasi Liljeberg
Abstract

Pattern discovery is one of the fundamental tasks in bioinformatics, and pattern recognition is a powerful technique for searching for sequence patterns in biological sequence databases. Fast, high-performance algorithms are in high demand in many applications in bioinformatics and computational molecular biology, because the significant increase in the number of DNA and protein sequences expands the need for faster pattern matching algorithms. For this purpose, heterogeneous architectures can be a good choice due to their potential for high performance and energy efficiency. In this paper we present an efficient implementation, on a heterogeneous CPU/GPU architecture, of two algorithms: Aho-Corasick (AC), a well-known exact pattern matching algorithm with linear complexity, and Parallel Failureless Aho-Corasick (PFAC), a massively parallelized version of AC without failure transitions. We progressively redesigned the algorithms and data structures to fit the GPU architecture. Our results on different protein sequence data sets show that the new implementation runs 15 times faster than the original implementation of the PFAC algorithm.
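
The difference between AC and PFAC described in the abstract can be illustrated with a minimal CPU-only sketch in Python. This is not the authors' implementation: the function names (build_trie, add_failure_links, ac_search, pfac_search) are hypothetical, patterns are assumed to be non-empty, and the loop over start positions in pfac_search only emulates sequentially what PFAC assigns to one GPU thread per input position.

```python
from collections import deque

def build_trie(patterns):
    """Build the goto function (a trie) shared by AC and PFAC."""
    goto = [{}]   # goto[state][char] -> next state
    out = [[]]    # out[state] -> patterns that end exactly at this state
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(p)
    return goto, out

def add_failure_links(goto, out):
    """Compute AC failure links breadth-first and merge suffix outputs."""
    fail = [0] * len(goto)
    out = [list(o) for o in out]
    queue = deque(goto[0].values())   # depth-1 states fail to the root
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] = out[t] + out[fail[t]]
            queue.append(t)
    return fail, out

def ac_search(text, patterns):
    """Classic AC: a single pass over the text using failure transitions."""
    goto, out = build_trie(patterns)
    fail, out = add_failure_links(goto, out)
    s, hits = 0, []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        for p in out[s]:
            hits.append((i - len(p) + 1, p))
    return sorted(hits)

def pfac_search(text, patterns):
    """Failureless variant: an independent trie walk from every position."""
    goto, out = build_trie(patterns)
    hits = []
    for start in range(len(text)):    # one GPU thread per start in PFAC
        s = 0
        for ch in text[start:]:
            if ch not in goto[s]:
                break                 # no failure link: this walk just stops
            s = goto[s][ch]
            for p in out[s]:
                hits.append((start, p))
    return sorted(hits)

patterns = ["he", "she", "his", "hers"]
print(ac_search("ushers", patterns))    # [(1, 'she'), (2, 'he'), (2, 'hers')]
print(pfac_search("ushers", patterns))  # same matches, found without failure links
```

Both functions report the same (start position, pattern) pairs. The failureless version does redundant work per position, but every walk is independent and usually short, which is what makes the approach suitable for massively parallel hardware.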

Related resources

Parallel computing using MPI and OpenMP on self-configured platform, UMZHPC.

Parallel computing is a topic of interest for a broad scientific community since it facilitates many time-consuming algorithms in different application domains. In this paper, we introduce a novel platform for parallel computing using the MPI and OpenMP programming models on a set of networked PCs. UMZHPC is a free Linux-based parallel computing infrastructure that has been developed to cr...

Process Introspection: A Checkpoint Mechanism for High Performance Heterogeneous Distributed Systems

The Process Introspection project is a design and implementation effort, the main goal of which is to construct a general purpose, flexible, efficient checkpoint/restart mechanism appropriate for use in high performance heterogeneous distributed systems. This checkpoint/restart mechanism has the primary constraint that it must be platform independent; that is, checkpoints produced on one archit...

Developing Hardware-Based Applications Using PRESENCE-2

The aim of the AICP (Ambient Intelligent Co-Processor) project is to develop and implement high performance hardware pattern matching algorithms for use in embedded ubiquitous systems. As part of this project we aim to implement the pattern-matching algorithms on the PRESENCE-2 hardware platform. PRESENCE-2 is a PCI-based accelerator card for high performance applications, designed and built here i...

Parallelization of Multiple String Matching on a Cluster Platform

This work proposes four parallel methods for multipattern matching which are executed on a heterogeneous cluster. These parallel methods are based on the master-worker paradigm and they implement different partitioning schemes such as static and dynamic load balancing. Furthermore, the parallel methods are analyzed experimentally using the Message Passing Interface (MPI) library on a cluster o...

A Dynamic Workload Balancing Technique of a Text Matching Algorithm on a Cluster

A dynamic workload allocation model which utilizes a data pool manager is investigated herein. It targets heterogeneous multicomputer environments, and the implementation is written in C using the MPICH NT 1.2.5 message passing interface for Microsoft Windows-based clusters. The algorithm used is the data-parallel adaptation of the Brute Force exact pattern matching algorithm. Performance ev...

Journal title:
  • Journal of integrative bioinformatics

Volume 11, Issue 3

Pages -

Publication date: 2014